Abstract. The dual attainment of the Monge-Kantorovich transport problem is analyzed in a general setting. The spaces $X$, $Y$ are assumed to be Polish and equipped with Borel probability measures $\mu$ and $\nu$. The transport cost function $c : X \times Y \to [0,\infty]$ is assumed to be Borel measurable. We show that a dual optimizer always exists, provided we interpret it as a projective limit of certain finitely additive measures. Our methods are functional analytic and rely on Fenchel's perturbation technique.



1. Introduction 

We consider the Monge-Kantorovich transport problem for Borel probability measures $\mu$, $\nu$ on Polish spaces $X$, $Y$. See [Vil03, Vil09] for an excellent account of the theory of optimal transportation. The set $\Pi(\mu,\nu)$ consists of all Monge-Kantorovich transport plans, that is, Borel probability measures on $X \times Y$ which have $X$-marginal $\mu$ and $Y$-marginal $\nu$. The transport costs associated to a transport plan $\pi$ are given by

$$\langle c,\pi\rangle = \int_{X\times Y} c(x,y)\,d\pi(x,y). \tag{1}$$

In most applications of the theory of optimal transport, the cost function $c : X\times Y \to [0,\infty]$ is lower semicontinuous and only takes values in $\mathbb{R}_+$. But equation (1) makes perfect sense if the $[0,\infty]$-valued cost function is only Borel measurable. We therefore assume throughout this paper that $c : X\times Y \to [0,\infty]$ is a Borel measurable function which may very well assume the value $+\infty$ for "many" $(x,y)\in X\times Y$. The subset $\{c=\infty\}$ of $X\times Y$ is a set of forbidden transitions.

Optimal transport on the Wiener space ([FU02, FU04a, FU04b, FU06]) and on configuration spaces ([Dec08, DJS08]) provides natural infinite dimensional settings where $c$ takes infinite values.

The (primal) Monge-Kantorovich problem is to determine the primal value

$$P := \inf\{\langle c,\pi\rangle : \pi\in\Pi(\mu,\nu)\} \tag{2}$$

and to identify a primal optimizer $\pi\in\Pi(\mu,\nu)$, which is also called an optimal transport plan. Clearly, without loss of generality this minimization can be performed among the finite transport plans, i.e. the infimum is taken over the plans $\pi\in\Pi(\mu,\nu)$ verifying $\langle c,\pi\rangle<\infty$. The dual Monge-Kantorovich problem consists in determining

$$D := \sup\Big\{\int_X \varphi\,d\mu + \int_Y \psi\,d\nu\Big\} \tag{3}$$

for $(\varphi,\psi)$ varying over the set of pairs of functions $\varphi : X\to[-\infty,\infty)$ and $\psi : Y\to[-\infty,\infty)$ which are integrable, i.e. $\varphi\in L^1(\mu)$, $\psi\in L^1(\nu)$, and satisfy $\varphi\oplus\psi\le c$. We have denoted $\varphi\oplus\psi(x,y) := \varphi(x)+\psi(y)$, $x\in X$, $y\in Y$.
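For orientation, the elementary inequality $D\le P$ (weak duality) can be written out in one line. This is a standard computation and not a statement taken from the text; it only uses that an admissible pair is integrable, satisfies $\varphi\oplus\psi\le c$, and that $\pi$ has marginals $\mu$ and $\nu$:

```latex
\int_X \varphi\,d\mu + \int_Y \psi\,d\nu
  = \int_{X\times Y} \varphi\oplus\psi\,d\pi
  \le \int_{X\times Y} c\,d\pi
  = \langle c,\pi\rangle
\qquad\text{for every } \pi\in\Pi(\mu,\nu).
```

Taking the supremum over admissible pairs on the left and the infimum over transport plans on the right gives $D\le P$.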

We say that there is no duality gap if the primal value $P$ of the problem equals the dual value $D$, there is primal attainment if there exists some optimal plan $\pi$, and there is integrable dual attainment if the above dual Monge-Kantorovich problem is attained for some $(\varphi,\psi)$. There is a long line of research on these questions, initiated already by Kantorovich ([Kan42]) himself and continued by numerous others (we mention [KR58, Dud76, Dud02, dA82, GR81, Fer81, Szu82, RR95, RR96, Mik06, MT06], see also the bibliographical notes in [Vil09, p. 86, 87]). Important progress was made by Kellerer [Kel84]. We also refer to the seminal paper [GM96] by Gangbo and McCann. Recently the authors of the present article have obtained in [BLS09a] a general duality result which is recalled below at Theorem 1.1.

It is well-known that there is primal attainment under the assumptions that c is lower 
semicontinuous and the primal value P is finite. On the other hand, it is easy to build 
examples where c is not lower semicontinuous and no primal minimizer exists. 

In this article we focus on the question of dual attainment.

The dual optimizers $(\varphi,\psi)$ are sometimes called Kantorovich potentials. In the Euclidean case with a quadratic cost, it is well-known that these potentials are convex conjugate to each other and that any optimal plan is supported by the subdifferential of $\varphi$. In the general case, these potentials are $c$-conjugate to each other, a notion introduced by Rüschendorf [Rus96].

Kellerer [Kel84, Theorem 2.21] established that integrable dual attainment holds true in the case of bounded $c$. This was extended by Ambrosio and Pratelli [AP03, Theorem 3.2], who gave appropriate moment conditions on $\mu$ and $\nu$ which are sufficient to guarantee the existence of integrable dual optimizers. Easy examples show that one cannot expect that the dual problem admits integrable maximizers unless the cost function satisfies certain integrability conditions with respect to $\mu$ and $\nu$ [BS09, Examples 4.4, 4.5]. In fact [BS09, Example 4.5] takes place in a very "regular" setting, where $c$ is squared Euclidean distance on $\mathbb{R}$. In this case there exist natural candidates $(\varphi,\psi)$ for the dual optimizer which, however, fail to be dual maximizers in the usual sense as they are not integrable.

The following solution was proposed in [BS09, Section 1.1]. If $\varphi$ and $\psi$ are integrable functions and $\pi\in\Pi(\mu,\nu)$ then

$$\int_X \varphi\,d\mu + \int_Y \psi\,d\nu = \int_{X\times Y} \varphi\oplus\psi\,d\pi. \tag{4}$$

If we drop the integrability condition on $\varphi$ and $\psi$, the left hand side need not make sense. But if we require that $\varphi\oplus\psi\le c$ and if $\pi$ is a finite cost transport plan, i.e. $\int_{X\times Y} c\,d\pi<\infty$, then the right hand side of (4) still makes good sense, assuming possibly the value $-\infty$, and we set

$$J_c(\varphi,\psi) = \int_{X\times Y} \varphi\oplus\psi\,d\pi.$$

It is not difficult to show (see [BS09, Lemma 1.1]) that this value does not depend on the choice of the finite cost transport plan $\pi$ and satisfies $J_c(\varphi,\psi)\le D$. Under the assumption that there exists some finite transport plan, we then say that we have measurable dual attainment in the optimization problem (3) if there exist Borel measurable functions $\varphi : X\to[-\infty,\infty)$ and $\psi : Y\to[-\infty,\infty)$ verifying $\varphi\oplus\psi\le c$ such that

$$D = J_c(\varphi,\psi). \tag{5}$$

In [BS09, Theorem 2] it was shown that, for Borel measurable $c : X\times Y\to[0,\infty]$ such that $c<\infty$, $\mu\otimes\nu$-almost surely, there is no duality gap and there is measurable dual attainment in the sense of (5).
A necessary and sufficient condition for the measurable dual attainment was proved in [BLS09a, Theorems 1.2 and 3.5]. We need some more notation to state this result below as Theorem 1.1. Fix $0\le\varepsilon\le 1$ and define $\Pi^\varepsilon(\mu,\nu) = \{\pi\in\mathcal{M}^+_{X\times Y} : \|\pi\|\ge 1-\varepsilon,\ p_X(\pi)\le\mu,\ p_Y(\pi)\le\nu\}$, where $\mathcal{M}^+_{X\times Y}$ denotes the non-negative Borel measures $\pi$ on $X\times Y$ with norm $\|\pi\| = \pi(X\times Y)$. By $p_X(\pi)\le\mu$ (resp. $p_Y(\pi)\le\nu$) we mean that the projection of $\pi$ onto $X$ (resp. onto $Y$) is dominated by $\mu$ (resp. $\nu$). We denote $P^\varepsilon := \inf\{\langle c,\pi\rangle : \pi\in\Pi^\varepsilon(\mu,\nu)\}$. This partial transport problem has recently been studied by Caffarelli and McCann [CM06] as well as Figalli [Fig09]. In their work the emphasis is on a finer analysis of the Monge problem for the squared Euclidean distance on $\mathbb{R}^n$, and pertains to a fixed $\varepsilon>0$. In the present paper, we do not deal with these more subtle issues of the Monge problem and always remain in the realm of the Kantorovich problem (2). We call

$$P^{\mathrm{rel}} := \lim_{\varepsilon\to 0} P^\varepsilon \tag{6}$$

the relaxed primal value of the transport problem. Obviously this limit exists (assuming possibly the value $+\infty$) and $P^{\mathrm{rel}}\le P$.

Theorem 1.1 (Measurable dual attainment [BLS09a]). Let $X$, $Y$ be Polish spaces, equipped with Borel probability measures $\mu$, $\nu$, and let $c : X\times Y\to[0,\infty]$ be Borel measurable.

(a) There is no duality gap if the primal problem is defined in the relaxed form while the dual problem is formulated in its usual form (3). In other words, we have $P^{\mathrm{rel}} = D$.

(b) Assume that in addition there exists a finite transport plan $\pi\in\Pi(\mu,\nu)$. The following statements are equivalent.

(i) There is measurable dual attainment, i.e. there exist measurable functions $\varphi$, $\psi$ such that $\varphi\oplus\psi\le c$ and $P^{\mathrm{rel}} = D = J_c(\varphi,\psi)$.

(ii) There exists a $\mu\otimes\nu$-a.s. finite function $h : X\times Y\to[0,\infty]$ such that $P^{\mathrm{rel}} = P^{c\wedge h} := \inf\{\langle c\wedge h,\pi\rangle : \pi\in\Pi(\mu,\nu)\}$.

The aim of the present paper is to go beyond the setting of this theorem where the measurable dual attainment is realized. We are going to discuss the existence of an optimizer of an extension of the dual problem (3), without imposing any further conditions on the Borel measurable cost function $c : X\times Y\to[0,\infty]$.

In Theorem 3.1 we take a somewhat unorthodox view of the general optimization problem. We start with a transport plan $\pi_0\in\Pi(\mu,\nu)$ with finite cost, but which is not supposed to be optimal. We then optimize over all the transport plans $\pi\in\Pi(\mu,\nu)$ such that the Radon-Nikodym derivative $d\pi/d\pi_0$ is bounded. In this setting we show that there is no duality gap and that there is a dual optimizer. However, this dual optimizer is not given by a pair of functions $(\varphi\oplus\psi)\in L^1(\pi_0)$, but rather as a weak star limit of a sequence $(\varphi_n\oplus\psi_n)_{n=1}^\infty\in L^1(\pi_0)$ in the bidual $L^1(\pi_0)^{**}$. A rather elaborate example in the accompanying paper [BLS09b] shows that this passage to the bidual is indeed necessary, in general.

While Theorem 3.1 depends on the choice of the finite transport plan $\pi_0\in\Pi(\mu,\nu)$, we formulate in Theorem 4.2 a result which does not depend on this choice. There we pass to a projective limit along a net of finite transport plans. Again we can prove that there is no duality gap and can identify a dual optimizer.

2. Two types of accident

In this section, we point out some difficulties which arise when going one step beyond the measurable dual attainment. We shall face two types of troubles which might be called

• measurability accident;

• singular concentration accident.

Before describing these phenomena, it is worth recalling some results from [BS09] and [Leo09] about optimal plans. The proofs of the present paper and of Theorems 2.1 and 2.2 below rely on three different types of techniques.
About the optimal plans. The following characterization of the optimal plans was proved in [BS09].

Theorem 2.1 ([BS09, Theorem 2]). Assume that $X$, $Y$ are Polish spaces equipped with Borel probability measures $\mu$, $\nu$, that $c : X\times Y\to[0,\infty]$ is Borel measurable and $\mu\otimes\nu$-a.e. finite and that there exists a finite transport plan.

(a) Let $\pi$ be a finite transport plan and assume that there exist measurable functions $\varphi : X\to[-\infty,\infty)$ and $\psi : Y\to[-\infty,\infty)$ which satisfy

$$\begin{cases} \varphi\oplus\psi\le c & \text{everywhere},\\ \varphi\oplus\psi = c & \pi\text{-almost everywhere}. \end{cases} \tag{7}$$

Then $J_c(\varphi,\psi) = \langle c,\pi\rangle$, thus $\pi$ is an optimal transport plan and $\varphi$, $\psi$ are dual maximizers in the sense of (5).

(b) Assume that $\pi$ is an optimal transport plan. Then it verifies (7) for every pair $(\varphi,\psi)$ of dual maximizers in the sense of (5).

Following a definition introduced in [ST09], a transport plan $\pi$ is said to be strongly $c$-cyclically monotone if there exist measurable functions $\varphi : X\to[-\infty,\infty)$, $\psi : Y\to[-\infty,\infty)$ which satisfy (7).
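The sufficient condition of Theorem 2.1-(a) can be checked by hand on a finite toy problem. The following is a minimal sketch, not taken from the paper: the two-point spaces, the cost matrix with one forbidden transition, the plan and the potentials are all illustrative assumptions, and the helper names are hypothetical. It verifies the two conditions in (7) and that the transport cost equals the dual value.

```python
from fractions import Fraction

INF = float("inf")  # marks a forbidden transition, c = +infinity


def total_cost(cost, plan):
    """<c, pi>: sum of c(x, y) * pi(x, y) over cells that carry mass.

    Cells without mass are skipped so that inf * 0 is never formed.
    """
    return sum(cost[i][j] * plan[i][j]
               for i in range(len(plan))
               for j in range(len(plan[0]))
               if plan[i][j] != 0)


def dual_value(phi, psi, mu, nu):
    """Integral of phi against mu plus integral of psi against nu."""
    return (sum(p * m for p, m in zip(phi, mu))
            + sum(q * n for q, n in zip(psi, nu)))


def is_certificate(cost, plan, phi, psi):
    """Check (7): phi(+)psi <= c everywhere, with equality where pi has mass."""
    for i, row in enumerate(plan):
        for j, mass in enumerate(row):
            s = phi[i] + psi[j]
            if s > cost[i][j]:
                return False
            if mass != 0 and s != cost[i][j]:
                return False
    return True


# Illustrative data (assumptions): X = Y = {0, 1} with uniform marginals.
half = Fraction(1, 2)
mu = nu = [half, half]
cost = [[1, INF], [2, 1]]
plan = [[half, 0], [0, half]]   # all mass on the diagonal
phi, psi = [0, 0], [1, 1]
```

Here `is_certificate(cost, plan, phi, psi)` holds and both `total_cost(cost, plan)` and `dual_value(phi, psi, mu, nu)` equal 1, so the diagonal plan is optimal for this toy cost.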

Denote by $\Pi(\mu,\nu,c)$ the set of finite cost transport plans

$$\Pi(\mu,\nu,c) := \Big\{\pi\in\Pi(\mu,\nu) : \int_{X\times Y} c\,d\pi<\infty\Big\},$$

and say that a property holds $\Pi(\mu,\nu,c)$-almost everywhere if it holds true outside a measurable set $N$ such that $\pi(N)=0$, for all $\pi\in\Pi(\mu,\nu,c)$.

In [Leo09], the assumption that $c$ is $\mu\otimes\nu$-a.e. finite was removed under the extra requirement that $c$ is lower semicontinuous and the following analogous results were obtained.

Theorem 2.2 ([Leo09]). Assume that $X$, $Y$ are Polish spaces equipped with Borel probability measures $\mu$, $\nu$, that $c : X\times Y\to[0,\infty]$ is lower semicontinuous and that there exists a finite transport plan.

(a) Let $\pi$ be a finite plan and assume that there exist measurable functions $\varphi : X\to[-\infty,\infty)$ and $\psi : Y\to[-\infty,\infty)$ which satisfy

$$\begin{cases} \varphi\oplus\psi\le c & \Pi(\mu,\nu,c)\text{-almost everywhere},\\ \varphi\oplus\psi = c & \pi\text{-almost everywhere}. \end{cases} \tag{8}$$

Then $J_c(\varphi,\psi) = \langle c,\pi\rangle$, thus $\pi$ is an optimal transport plan and $\varphi$, $\psi$ are dual maximizers in the sense of (5).

(b) Take any optimal plan $\pi$, $\varepsilon>0$ and $\pi_0$ any probability measure on $X\times Y$ such that $\int_{X\times Y} c\,d\pi_0<\infty$. Then, there exist functions $h\in L^1(\pi+\pi_0)$, $\varphi$ and $\psi$ bounded continuous on $X$ and $Y$ respectively and a measurable subset $Z_\varepsilon\subset X\times Y$ such that

(i) $h = c$, $\pi$-almost everywhere on $(X\times Y)\setminus Z_\varepsilon$;

(ii) $\int_{Z_\varepsilon}(1+c)\,d\pi\le\varepsilon$;

(iii) $-c/\varepsilon\le h\le c$, $(\pi+\pi_0)$-almost everywhere;

(iv) $-c/\varepsilon\le\varphi\oplus\psi\le c$, everywhere;

(v) $\|h-\varphi\oplus\psi\|_{L^1(\pi+\pi_0)}\le\varepsilon$.

As regards (a), the examples [BGMS09, Example 5.1] and [BS09, Example 4.2] exhibit optimal plans which are not strongly $c$-cyclically monotone but which satisfy the weaker property (8). As regards (b), let us emphasize the appearance of the probability measure $\pi_0$ in items (iii) and (v). One can read (iii-v) as an approximation of $\varphi\oplus\psi\le c$, $(\pi+\pi_0)$-a.e. Since it is required that $\int_{X\times Y} c\,d\pi_0<\infty$, one can choose $\pi_0$ in $\Pi(\mu,\nu,c)$, and the properties (i-v) are an approximation of (8) where $\Pi(\mu,\nu,c)$-a.e. is replaced by the weaker $(\pi+\pi_0)$-a.e.

Note also that for any $(\varphi,\psi)$ verifying (7) or (8) with $\pi\in\Pi(\mu,\nu,c)$, we have

$$\mu(\varphi=-\infty) = \nu(\psi=-\infty) = 0. \tag{9}$$

As a consequence of this remark and a result of Kellerer [Kel84], see [BLS09a, Lemma A.1], we can replace "$\varphi\oplus\psi\le c$ everywhere" in (7) by "$\varphi\oplus\psi\le c$, $\Pi(\mu,\nu)$-almost everywhere." The comparison between (7) and (8) becomes clearer.

Measurability accident. To develop a feeling for what we are after, we consider a specific 
example. 

Example 2.3 (Ambrosio-Pratelli, [AP03, Example 3.2]). Let $X = Y = [0,1)$, equipped with Lebesgue measure $\lambda = \mu = \nu$. Pick $\alpha\in[0,1)$ irrational. Set

$$\Gamma_0 = \{(x,x) : x\in X\}, \qquad \Gamma_1 = \{(x,x\oplus\alpha) : x\in X\},$$

where $\oplus$ is addition modulo 1. Define $c : X\times Y\to[0,\infty]$ by

$$c(x,y) = \begin{cases} 1 & \text{for } (x,y)\in\Gamma_0,\\ 2 & \text{for } (x,y)\in\Gamma_1,\ x\in[0,1/2),\\ 0 & \text{for } (x,y)\in\Gamma_1,\ x\in[1/2,1),\\ \infty & \text{else}. \end{cases}$$

This cost function is a variation on [AP03]'s original example which has been proposed in [BS09, Example 4.3]. For $i=0,1$, let $\pi_i$ be the obvious transport plan supported by $\Gamma_i$. Following the arguments of [AP03], it is easy to see that all finite transport plans are given by convex combinations of the form $\rho\pi_0 + (1-\rho)\pi_1$, $\rho\in[0,1]$, and each of these transport plans leads to costs of 1.

Note that since $c$ is lower semicontinuous, there is no duality gap. This was proved in [Kel84] and is an easy consequence of Theorem 1.1-(a). Thus, for each $\varepsilon>0$, there are integrable functions $\varphi,\psi : [0,1)\to[-\infty,\infty)$ such that $\varphi\oplus\psi\le c$ and $0\le\int(c-\varphi\oplus\psi)\,d\pi_i<\varepsilon$ for $i=0,1$.

On the other hand, it is shown in [BS09] that there do not exist measurable functions $\varphi,\psi : [0,1)\to[-\infty,\infty)$ satisfying $\varphi\oplus\psi\le c$ such that $\varphi\oplus\psi = c$ holds $\pi_0$- as well as $\pi_1$-almost surely.

Let us have a closer look at the previous example: while it is not possible to find Borel measurable limits $\varphi$, $\psi$ of an optimizing sequence $(\varphi_n,\psi_n)_{n=1}^\infty$, it is possible to find a limiting Borel function $h(x,y)$ of the sequence of functions $(\varphi_n(x)+\psi_n(y))_{n=1}^\infty$ on the set $\{(x,y)\in X\times Y : c(x,y)<\infty\}$. Indeed, on this set, which simply equals $\Gamma_0\cup\Gamma_1$, any optimizing sequence $(\varphi_n(x)+\psi_n(y))_{n=1}^\infty$ for (3) has a subsequence which converges $\pi$-a.s. to $h(x,y) := c(x,y)$, for any finite cost transport plan $\pi$.

Summing up: in the context of the previous example, there is a Borel function $h(x,y)$ on $X\times Y$, which equals $c(x,y)$ on $\Gamma_0\cup\Gamma_1$; it may take any value on $(X\times Y)\setminus(\Gamma_0\cup\Gamma_1)$, e.g. the value $+\infty$. This function $h(x,y)$ may be considered as a kind of dual optimizer: it is, for any finite cost transport plan $\pi$, the limit of an optimizing sequence $(\varphi_n(x)+\psi_n(y))_{n=1}^\infty$ with respect to the norm $\|\cdot\|_{L^1(\pi)}$.

Singular concentration accident. One can rewrite the sufficient conditions of Theorems 2.1-(a) and 2.2-(a) as follows: $\pi$ and $(\varphi,\psi)$ solve the primal and dual problems if $\pi\in\Pi(\mu,\nu,c)$, $(\varphi\oplus\psi)\pi = c\pi$ and $(\varphi\oplus\psi)\tilde\pi\le c\tilde\pi$, $\forall\tilde\pi\in\Pi(\mu,\nu,c)$, in the space of bounded measures. In view of Example 2.3 and of part (b) of Theorem 2.2, we are aware that $\varphi\oplus\psi$ should be replaced by a jointly measurable $h$ such that for each $\pi\in\Pi(\mu,\nu,c)$, $h\pi$ can be approximated in variation norm by a sequence $((\varphi_n\oplus\psi_n)\pi)_{n=1}^\infty$ verifying $(\varphi_n\oplus\psi_n)\pi\le c\pi$ for all $n\ge 1$. But this is not the end of the story.

In the accompanying paper [BLS09b], rather elaborate extensions of the above example are analyzed. By means of examples (which are too long to be recalled here), it is shown that instead of the functions $h$ or, equivalently, countably additive measures $h\pi$, one has to consider finitely additive measures. This might be seen as a consequence of the limiting behavior of functions $\varphi\oplus\psi$ tending to $-\infty$ somewhere, under the seemingly contradictory requirement (9).

MATHIAS BEIGLBÖCK, CHRISTIAN LÉONARD, AND WALTER SCHACHERMAYER

3. Existence of a dual optimizer 

The remainder of this article is devoted to developing a theory which makes this circle of ideas precise in the general setting of Borel measurable cost functions $c : X\times Y\to[0,\infty]$. To do so we shall apply Fenchel's perturbation method as in [BLS09a]. In addition, we need some functional analytic machinery, in particular we shall use the space $(L^1)^{**} = (L^\infty)^*$ of finitely additive measures.

Assume $\Pi(\mu,\nu,c)\ne\emptyset$ to avoid the trivial case.

We fix $\pi_0\in\Pi(\mu,\nu,c)$ and stress that we do not assume that $\pi_0$ has minimal transport cost. In fact, there is little reason in the present setting (where $c$ is not assumed to be lower semicontinuous) why a primal optimizer $\pi$ should exist. We denote by $\Pi^{(\pi_0)}(\mu,\nu)$ the set of elements $\pi\in\Pi(\mu,\nu)$ such that $\pi\ll\pi_0$ and $\big\|\frac{d\pi}{d\pi_0}\big\|_{L^\infty(\pi_0)}<\infty$. Note that

$$\Pi^{(\pi_0)}(\mu,\nu) = \Pi(\mu,\nu)\cap L^\infty(\pi_0)\subseteq\Pi(\mu,\nu,c).$$

We shall replace the usual Kantorovich optimization problem over the set $\Pi(\mu,\nu,c)$ by the optimization over the smaller set $\Pi^{(\pi_0)}(\mu,\nu)$ and consider

$$P^{(\pi_0)} = \inf\Big\{\langle c,\pi\rangle = \int c\,d\pi : \pi\in\Pi^{(\pi_0)}(\mu,\nu)\Big\}. \tag{10}$$

As regards the dual problem, we define for $\varepsilon>0$,

$$D^{(\pi_0,\varepsilon)} = \sup\Big\{\int\varphi\,d\mu+\int\psi\,d\nu : \varphi\in L^1(\mu),\ \psi\in L^1(\nu),\ \int_{X\times Y}(\varphi\oplus\psi-c)_+\,d\pi_0\le\varepsilon\Big\}$$

and

$$D^{(\pi_0)} = \lim_{\varepsilon\to 0} D^{(\pi_0,\varepsilon)}. \tag{11}$$

Define the "summing" map $S$ by

$$S : L^1(X,\mu)\times L^1(Y,\nu)\to L^1(X\times Y,\pi_0), \qquad (\varphi,\psi)\mapsto\varphi\oplus\psi,$$

and denote by $L^1_S(X\times Y,\pi_0)$ the $\|\cdot\|_1$-closed linear subspace of $L^1(X\times Y,\pi_0)$ spanned by $S(L^1(X,\mu)\times L^1(Y,\nu))$. Clearly $L^1_S(X\times Y,\pi_0)$ is a Banach space under the norm $\|\cdot\|_1$ induced by $L^1(X\times Y,\pi_0)$.

We shall also need the bi-dual $L^1_S(X\times Y,\pi_0)^{**}$ which may be identified with a subspace of $L^1(X\times Y,\pi_0)^{**}$. In particular, an element $h\in L^1_S(X\times Y,\pi_0)^{**}$ can be decomposed into $h = h^r+h^s$, where $h^r\in L^1(X\times Y,\pi_0)$ is the regular part of the finitely additive measure $h$ and $h^s$ its purely singular part. Note that it may happen that $h\in L^1_S(X\times Y,\pi_0)^{**}$ while $h^r\notin L^1_S(X\times Y,\pi_0)$, and therefore also $h^s\notin L^1_S(X\times Y,\pi_0)^{**}$.

Theorem 3.1. Let $c : X\times Y\to[0,\infty]$ be Borel measurable and let $\pi_0\in\Pi(\mu,\nu,c)$ be a finite transport plan. We have

$$P^{(\pi_0)} = D^{(\pi_0)}. \tag{12}$$

There is an element $h\in L^1_S(X\times Y,\pi_0)^{**}$ which verifies the inequality $h\le c$ in the Banach lattice $L^1(X\times Y,\pi_0)^{**}$ (see footnote 1) and

$$D^{(\pi_0)} = \langle h,\pi_0\rangle.$$

If $\pi\in\Pi^{(\pi_0)}(\mu,\nu)$ (identifying $\pi$ with $\frac{d\pi}{d\pi_0}$) satisfies $\int c\,d\pi\le P^{(\pi_0)}+\alpha$ for some number $\alpha\ge 0$, then

$$-\alpha\le\langle h^s,\pi\rangle\le 0. \tag{13}$$

In particular, if $\pi$ is an optimizer of (10), then $h^s$ vanishes on the set $\{\frac{d\pi}{d\pi_0}>0\}$.

In addition, we may find a sequence of elements $(\varphi_n,\psi_n)\in L^1(\mu)\times L^1(\nu)$ such that

$$\varphi_n\oplus\psi_n\to h^r, \quad \pi_0\text{-a.s.},$$

$$\|(\varphi_n\oplus\psi_n-h^r)_+\|_{L^1(\pi_0)}\to 0 \quad\text{and}$$

$$\lim_{\delta\to 0}\ \sup_{A\subseteq X\times Y,\ \pi_0(A)<\delta}\ \lim_{n\to\infty} -\langle(\varphi_n\oplus\psi_n)\mathbf{1}_A,\pi_0\rangle = \|h^s\|_{L^1(\pi_0)^{**}}. \tag{14}$$

Footnote 1. The inequality $h\le c$ pertains to the lattice order of $L^1(X\times Y,\pi_0)^{**}$, where we identify the $\pi_0$-integrable function $c$ with an element of $L^1(X\times Y,\pi_0)^{**}$. If $h$ decomposes into $h = h^r+h^s$, the inequality $h\le c$ holds true if and only if $h^r(x,y)\le c(x,y)$, $\pi_0$-a.s. and $h^s\le 0$ (compare the discussion after (18)).

Proof. It is straightforward to verify the trivial duality relation $D^{(\pi_0)}\le P^{(\pi_0)}$. To show the reverse inequality and to find the dual optimizer $h\in L^1(X\times Y,\pi_0)^{**}$, as in [BLS09a] we apply W. Fenchel's perturbation argument. (For an elementary treatment, compare also [BLS09b].) The summing map $S$ factors through $L^1_S(\pi_0)$ as indicated in the subsequent diagram:

$$L^1(\mu)\times L^1(\nu)\ \xrightarrow{\ S_1\ }\ L^1_S(\pi_0)\ \xrightarrow{\ S_2\ }\ L^1(\pi_0), \qquad S = S_2\circ S_1.$$

Then $S_1$ has dense range and $S_2$ is an isometric embedding. Denote by $(L^1_S(\pi_0)^*, \|\cdot\|_{L^1_S(\pi_0)^*})$ the dual of $L^1_S(\pi_0)$, which is a quotient space of $L^\infty(\pi_0)$. Transposing the above diagram we get

$$L^\infty(\mu)\times L^\infty(\nu)\ \xleftarrow{\ T_1\ }\ L^1_S(\pi_0)^*\ \xleftarrow{\ T_2\ }\ L^\infty(\pi_0), \qquad T = T_1\circ T_2,$$

where $T$, $T_1$, $T_2$ are the transposed maps of $S$, $S_1$, resp. $S_2$. Clearly $T(\gamma) = (p_X(\gamma),p_Y(\gamma))$ for $\gamma\in L^\infty(\pi_0)$, where $p_X$, $p_Y$ are the projections of a measure $\gamma$ (identified with the Radon-Nikodym derivative $\frac{d\gamma}{d\pi_0}$) onto its marginals. By elementary duality relations we have that $T_2$ is a quotient map and $T_1$ is injective; the latter fact allows us to identify the space $L^1_S(\pi_0)^*$ with a subspace of $L^\infty(\mu)\times L^\infty(\nu)$.

For example, consider the element $\mathbf{1}\in L^\infty(\pi_0)$, which corresponds to the measure $\pi_0$ on $X\times Y$. The element $T_2(\mathbf{1})\in L^1_S(\pi_0)^*$ may then be identified with the element $(\mathbf{1},\mathbf{1}) = T(\mathbf{1})$ in $L^\infty(\mu)\times L^\infty(\nu)$, which corresponds to the pair $(\mu,\nu)$. We take the liberty to henceforth denote this element simply by $\mathbf{1}$, independently of whether we consider it as an element of $L^\infty(\pi_0)$, $L^1_S(\pi_0)^*$ or $L^\infty(\mu)\times L^\infty(\nu)$.

We may now rephrase the primal problem (10) as

$$\langle c,\gamma\rangle = \int c(x,y)\,d\gamma(x,y)\to\min, \qquad \gamma\in L^\infty_+(\pi_0),$$

under the constraint

$$T(\gamma) = \mathbf{1}. \tag{15}$$

The decisive trick is to replace (15) by the trivially equivalent constraint

$$T_2(\gamma) = \mathbf{1},$$

and to perform the Fenchel perturbation argument not in the space $L^\infty(\mu)\times L^\infty(\nu)$, but rather in the subspace $L^1_S(\pi_0)^*$ which is endowed with a stronger norm. The map $\Phi$:

$$\Phi : L^1_S(\pi_0)^*\to[0,\infty], \qquad \Phi(p) := \inf\{\langle c,\gamma\rangle : \gamma\in L^\infty_+(\pi_0),\ T_2(\gamma) = p\}, \quad p\in L^1_S(\pi_0)^*,$$

is convex, positively homogeneous and $\Phi(\mathbf{1}) = P^{(\pi_0)}$.

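Claim. There is a neighbourhood $V$ of $\mathbf{1}$ in $L^1_S(\pi_0)^*$ on which $\Phi$ is bounded.

Indeed, let $U = \{\gamma\in L^\infty(\pi_0) : \|\gamma-\mathbf{1}\|_{L^\infty(\pi_0)}<\tfrac12\}$. Then $U$ is contained in the positive orthant $L^\infty_+(\pi_0)$ of $L^\infty(\pi_0)$ and

$$\Phi(T_2(\gamma))\le\langle c,\gamma\rangle\le\tfrac32\|c\|_{L^1(\pi_0)} \quad\text{for all } \gamma\in U.$$

Hence on $T_2(U)$, which simply is the open ball of radius $\tfrac12$ around $\mathbf{1}$ in the Banach space $L^1_S(\pi_0)^*$, we have that $\Phi$ is bounded by $\tfrac32\|c\|_{L^1(\pi_0)}$.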
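The bound in the Claim can be spelled out in one chain of inequalities. This is a sketch under the assumption that the ball $U$ has radius $\tfrac12$ (the constants in the extracted text are damaged); it uses only $c\ge 0$ and the fact that every $\gamma\in U$ is itself admissible in the infimum defining $\Phi(T_2(\gamma))$:

```latex
\tfrac12 < \gamma < \tfrac32 \quad \pi_0\text{-a.s.}
\quad\Longrightarrow\quad
\Phi(T_2(\gamma)) \le \langle c,\gamma\rangle
  = \int_{X\times Y} c\,\gamma\,d\pi_0
  \le \tfrac32 \int_{X\times Y} c\,d\pi_0
  = \tfrac32\,\|c\|_{L^1(\pi_0)} .
```

The finiteness of the right hand side is exactly the assumption that $\pi_0$ is a finite cost transport plan.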

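It follows from elementary geometric facts that the convex function $\Phi$ is continuous on $T_2(U)$ with respect to the norm of $L^1_S(\pi_0)^*$. By Hahn-Banach there exists $f\in L^1_S(\pi_0)^{**}$ such that

$$\langle f,\mathbf{1}\rangle = \Phi(\mathbf{1}),$$

$$\langle f,p\rangle\le\Phi(p) \quad\text{for all } p\in L^1_S(\pi_0)^*.$$

The adjoint $T_2^*$ of $T_2$ maps $L^1_S(\pi_0)^{**}$ isometrically onto a subspace $E$ of $L^1(\pi_0)^{**} = L^\infty(\pi_0)^*$. The space $E$ consists of those elements of $L^1(\pi_0)^{**}$ which are $\sigma^*$-limits of nets $(\varphi_\alpha\oplus\psi_\alpha)_{\alpha\in I}$ with $\varphi_\alpha\in L^1(\mu)$, $\psi_\alpha\in L^1(\nu)$. Write $h := T_2^*(f)$. Then for all $\gamma\in L^\infty_+(\pi_0)$,

$$\langle h,\gamma\rangle = \langle T_2^*(f),\gamma\rangle = \langle f,T_2(\gamma)\rangle\le\Phi(T_2(\gamma))\le\langle c,\gamma\rangle, \tag{16}$$

and if $\pi\in L^\infty_+(\pi_0)$, $T_2(\pi) = \mathbf{1}$ then

$$\langle h,\pi\rangle = \langle T_2^*(f),\pi\rangle = \langle f,T_2(\pi)\rangle = \langle f,\mathbf{1}\rangle = \Phi(\mathbf{1}) = P^{(\pi_0)}. \tag{17}$$

By (16), the inequality $h\le c$ holds true in the Banach lattice $L^\infty(\pi_0)^*$. Combining this with (17) we obtain that $h$ is a dual optimizer in the sense of

$$D^{(\pi_0)}_{**} := \sup\{\langle g,\pi_0\rangle : g\in L^1_S(\pi_0)^{**},\ g\le c \text{ in the Banach lattice } L^1(\pi_0)^{**}\} \tag{18}$$

(where we identify $\pi_0$ with the element $\mathbf{1}$ of $L^\infty(\pi_0)$) and that there is no duality gap in this sense, i.e. $D^{(\pi_0)}_{**} = P^{(\pi_0)}$.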

As mentioned above, every element $g\in L^\infty(\pi_0)^*$ splits into a regular part $g^r$ lying in $L^1(\pi_0)$ and a purely singular part $g^s$. Given $g_1,g_2\in L^\infty(\pi_0)^*$, we have $g_1\le g_2$ if and only if $g_1^r\le g_2^r$ and $g_1^s\le g_2^s$. Since $c\in L^1(\pi_0)$ we have $c^s = 0$. The inequality $h\le c$ implies that $h^s\le c^s = 0$ and $h^r\le c^r = c$. It follows that for each $\pi\in L^\infty_+(\pi_0)$

$$\langle h^r,\pi\rangle\le\langle c,\pi\rangle. \tag{19}$$

Assume additionally that $\pi$ satisfies $T_2(\pi) = \mathbf{1}$ and choose $\alpha\ge 0$ such that $\langle c,\pi\rangle\le P^{(\pi_0)}+\alpha$. Then $\langle h,\pi\rangle = P^{(\pi_0)}$ and subtracting this quantity from (19) we get

$$\langle -h^s,\pi\rangle = \langle h^r-h,\pi\rangle\le\langle c,\pi\rangle-P^{(\pi_0)}\le\alpha,$$

showing (13).

We still have to show the existence of a sequence $(\varphi_n,\psi_n)_{n=1}^\infty$ satisfying the above assertions about convergence. So far we know that there is a net $(\varphi_\alpha,\psi_\alpha)_{\alpha\in I}$ such that $\varphi_\alpha\oplus\psi_\alpha$ weak-star converges to $h$. First we claim that there exists a net $(f_\alpha)_{\alpha\in I}$ of elements of $L^1(\pi_0)$, such that $\|f_\alpha\|_1\le\|h^s\|$, $h^r+f_\alpha\in L^1_S(\pi_0)$ and $h^r+f_\alpha\to h$ in the $\sigma^*$-topology. To see this, note that Alaoglu's theorem [RS80, Theorem IV.21] implies that in a Banach space $V$, the unit ball $B_1(V)$ is $\sigma^*$-dense in the unit ball $B_1(V^{**})$ of the bidual. Thus $h^r+\|h^s\|B_1(L^1_S(\pi_0))$ is $\sigma^*$-dense in $h^r+\|h^s\|B_1(L^1_S(\pi_0)^{**})$, which yields the existence of a net $(f_\alpha)_{\alpha\in I}$ as required.
As $h^s$ is purely singular, we may find a sequence $(\alpha_n)_{n=1}^\infty$ in $I$ such that $\|f_{\alpha_n}\|\le\|h^s\|$ and $\int f_{\alpha_n}\,d\pi_0 = -\|h^s\|+2^{-n}$, and that $\int(|f_{\alpha_n}|\wedge 2^{n})\,d\pi_0<2^{-n}$, which implies that the sequence $(f_{\alpha_n})_{n=1}^\infty$ converges $\pi_0$-a.s. to zero.

As $h^r+f_{\alpha_n}\in L^1_S(\pi_0)$ we may find $(\varphi_n,\psi_n)\in L^1(\mu)\times L^1(\nu)$ such that

$$\|\varphi_n\oplus\psi_n-(h^r+f_{\alpha_n})\|_{L^1(\pi_0)}<2^{-n}.$$

We then have that $(\varphi_n\oplus\psi_n)_{n=1}^\infty$ converges $\pi_0$-a.s. to $h^r$ and that $\|(\varphi_n\oplus\psi_n-h^r)_+\|_{L^1(\pi_0)}\to 0$.

As regards assertion (14) we note that, for $A_m = \bigcup_{n=m+1}^\infty\{|f_{\alpha_n}|>2^{-n}\}$, we have $\pi_0(A_m)<2^{-m}$ and

$$\begin{aligned}
\liminf_{n\to\infty}\big(-\langle(\varphi_n\oplus\psi_n)\mathbf{1}_{A_m},\pi_0\rangle\big)
&= -\limsup_{n\to\infty}\langle(h^r+f_{\alpha_n})\mathbf{1}_{A_m},\pi_0\rangle\\
&= -\langle h^r\mathbf{1}_{A_m},\pi_0\rangle-\lim_{n\to\infty}\langle f_{\alpha_n}\mathbf{1}_{A_m},\pi_0\rangle\\
&= -\langle h^r\mathbf{1}_{A_m},\pi_0\rangle+\|h^s\|_{L^1(\pi_0)^{**}}.
\end{aligned}$$

Letting $m$ tend to infinity we obtain that the left hand side of (14) is greater than or equal to the right hand side. As regards the reverse inequality it suffices to note that

$$\|f_{\alpha_n}\|_{L^1(\pi_0)}\le\|h^s\|_{L^1(\pi_0)^{**}}.$$

As $h^r\le c$, $\pi_0$-a.s., we obtain in particular that $\|(\varphi_n\oplus\psi_n-c)_+\|_{L^1(\pi_0)}\to 0$, showing that $D^{(\pi_0)}\ge P^{(\pi_0)}$ and therefore (12), the reverse inequality being straightforward. □



As a by-product of this proof, we have shown in (18) that

$$D^{(\pi_0)}_{**} = D^{(\pi_0)} = P^{(\pi_0)}. \tag{20}$$

Admittedly, Theorem 3.1 is rather abstract. However, we believe that it may be useful in applications to have the possibility to pass to some kind of limit $h$ of an optimizing sequence $(\varphi_n,\psi_n)_{n=1}^\infty$ in the dual optimization problem, even if this limit is somewhat awkward. To develop some intuition for the message of Theorem 3.1 we shall illustrate the situation by means of some examples.

Let us start with Example 2.3. In this case we may apply Theorem 3.1 to the finite transport plan $\bar\pi = \tfrac12(\pi_0+\pi_1)$ (we apologize for using $\bar\pi$ instead of $\pi_0$ in Theorem 3.1, as the notation $\pi_0$ is already taken). As we have seen above, there are sequences $(\varphi_n\oplus\psi_n)_{n=1}^\infty$ converging $\bar\pi$-a.s. as well as in the norm of $L^1(\bar\pi)$ to $h = c$, as defined in Example 2.3 above. In particular we do not have to bother about the singular part $h^s$ of $h$, as we have $h = h^r$ in this example. We find again that $h$ represents the limit of $(\varphi_n\oplus\psi_n)_{n=1}^\infty$, considered as a Borel function on $\{c<\infty\}$, which is the support of $\bar\pi$.

We now make the example a bit more interesting and challenging. (See Example 3.2 below.)

Fix in the context of Example 2.3 (where we now write $\tilde c$ instead of $c$ to keep the letter $c$ free for a new function to be constructed) a sequence $(\varphi_n,\psi_n)_{n=1}^\infty$ such that $\|\tilde c-\varphi_n\oplus\psi_n\|_{L^1(\pi_i)}\to 0$ for $i=0,1$. We claim that $(\varphi_n\oplus\psi_n)_{n=1}^\infty$ converges in $\|\cdot\|_{L^1(\pi_k)}$ where, for each $k\in\mathbb{N}$, $\pi_k$ is the measure which is uniformly distributed on

$$\Gamma_k = \{(x,x\oplus k\alpha) : x\in[0,1)\}. \tag{21}$$
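Let us prove this convergence, whose precise statement is given below at (26) and (27). We know (see footnote 2)

$$\varphi_n(x)+\psi_n(x)\to\tilde c(x,x) \quad\text{and} \tag{22}$$

$$\varphi_n(x)+\psi_n(x\oplus\alpha)\to\tilde c(x,x\oplus\alpha), \quad\text{whence}$$

$$\psi_n(x\oplus\alpha)-\psi_n(x)\to\tilde c(x,x\oplus\alpha)-\tilde c(x,x) = \begin{cases} 1 & \text{for } x\in[0,\tfrac12),\\ -1 & \text{for } x\in[\tfrac12,1) \end{cases} \ =: g(x). \tag{23}$$

Replacing $x$ by $x\oplus i\alpha$, $i=1,\dots,k-1$ in (23) this yields

$$\psi_n(x\oplus k\alpha)-\psi_n(x)\to\sum_{i=0}^{k-1} g(x\oplus i\alpha).$$

Combined with (22) we have

$$\lim_{n\to\infty}\big[\varphi_n(x)+\psi_n(x\oplus k\alpha)\big] = 1+\sum_{i=0}^{k-1} g(x\oplus i\alpha) \tag{24}$$

$$= 1+\#\{0\le i<k : x\oplus i\alpha\in[0,\tfrac12)\}-\#\{0\le i<k : x\oplus i\alpha\in[\tfrac12,1)\} =: \varrho_k(x). \tag{25}$$

Define the function $h$ on $X\times Y$

$$h(x,y) = \begin{cases} \varrho_k(x) & \text{for } (x,y)\in\Gamma_k,\ k\in\mathbb{N},\\ \infty & \text{else}. \end{cases} \tag{26}$$

By (24), we have, for each $k\in\mathbb{N}$, $\lim_n\|h-\varphi_n\oplus\psi_n\|_{L^1(\pi_k)} = 0$. Somewhat more precisely, one obtains that

$$\|h-\varphi_n\oplus\psi_n\|_{L^1(\pi_k)}\le k\,\|\tilde c-\varphi_n\oplus\psi_n\|_{L^1(\pi_0+\pi_1)}. \tag{27}$$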
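The functions $g$ and $\varrho_k$ from (23)-(25) are easy to evaluate numerically. The following is a minimal sketch, not part of the paper: the choice $\alpha = \sqrt2-1$, the grid size and all function names are illustrative assumptions. It evaluates $\varrho_k(x)$, its positive part along $\Gamma_k$, and a midpoint-rule approximation of $\int_0^1\varrho_k\,d\lambda$, which equals 1 because each shifted copy of $g$ integrates to zero.

```python
import math

ALPHA = math.sqrt(2) - 1  # assumed irrational shift in [0, 1); any irrational would do


def shift(x, k, alpha=ALPHA):
    """Return x (+) k*alpha, i.e. addition modulo 1."""
    return (x + k * alpha) % 1.0


def g(x):
    """g from (23): +1 on [0, 1/2) and -1 on [1/2, 1)."""
    return 1 if x < 0.5 else -1


def rho(k, x, alpha=ALPHA):
    """rho_k(x) = 1 + sum_{i<k} g(x (+) i*alpha), as in (24)-(25)."""
    return 1 + sum(g(shift(x, i, alpha)) for i in range(k))


def positive_part_on_gamma(k, x, alpha=ALPHA):
    """Positive part of h at the point (x, x (+) k*alpha) of Gamma_k."""
    return max(rho(k, x, alpha), 0)


def mean_rho(k, n_grid=10_000, alpha=ALPHA):
    """Midpoint-rule approximation of the integral of rho_k over [0, 1)."""
    total = sum(rho(k, (j + 0.5) / n_grid, alpha) for j in range(n_grid))
    return total / n_grid
```

For $k=0$ and $k=1$ the function reproduces the cost of Example 2.3 on $\Gamma_0$ and $\Gamma_1$: `rho(0, x)` is 1 for every `x`, while `rho(1, x)` is 2 on $[0,1/2)$ and 0 on $[1/2,1)$.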

Now we shall modify the cost function $\tilde c$ of Example 2.3 by defining it to be finite not only on $\Gamma_0\cup\Gamma_1$, but rather on $\bigcup_{k\in\mathbb{N}}\Gamma_k$. We then obtain the following situation.

Example 3.2. Using (26) define $c : [0,1)\times[0,1)\to[0,\infty]$ by

$$c(x,y) = h(x,y)_+,$$

so that $\{c<\infty\} = \bigcup_{k\in\mathbb{N}}\Gamma_k$. For the resulting optimal transport problem we then find:

(i) The primal value $P$ of the problem (2) equals zero and $\varphi = \psi = 0$ are (trivial) optimizers of the dual problem (3).

(ii) For strictly positive scalars $(a_k)_{k\ge 0}$, normalized by $\sum_{k\ge 0} a_k = 1$, apply Theorem 3.1 to the transport plan $\pi := \sum_{k\ge 0} a_k\pi_k$. (Again we apologize for using the notation $\pi$ for the measure $\pi_0$ in Theorem 3.1, as all the letters $\pi_k$ are already taken.) If $(a_k)_{k\ge 0}$ tends sufficiently fast to zero, as $|k|\to\infty$, the following facts are verified.

- The primal value is

$$P^{(\pi)} = \inf\Big\{\int c\,d\pi' : \pi'\in\Pi(\mu,\nu),\ \Big\|\frac{d\pi'}{d\pi}\Big\|_{L^\infty(\pi)}<\infty\Big\} = 1.$$

- The Borel function $h\in L^1(\pi)$ defined in (26) is a dual optimizer in the sense of Theorem 3.1, i.e.

$$D^{(\pi)} = \int_{X\times Y} h\,d\pi = 1.$$

- There is a sequence $(\varphi_n,\psi_n)_{n=1}^\infty$ in $L^1(\mu)\times L^1(\nu)$ such that $(\varphi_n\oplus\psi_n)_{n=1}^\infty$ converges to $h$ in the norm of $L^1(\pi)$.

Footnote 2. The equations (22) to (25) refer to integrable functions on $[0,1)$ and convergence is understood to be with respect to $\|\cdot\|_{L^1(\mu)}$.
Before proving the above assertions let us draw one conclusion: in (ii) we cannot assert that the functions $(\varphi_n,\psi_n)_{n=1}^\infty$ satisfy - in addition to the properties above - the inequality $\varphi_n(x)+\psi_n(y)\le c(x,y)$, for all $(x,y)\in X\times Y$. Indeed, if this were possible then, because of $\lim_{n\to\infty}\big(\int_X\varphi_n\,d\mu+\int_Y\psi_n\,d\nu\big) = D^{(\pi)} = 1$, we would have that the dual value $D$ of the original dual problem (3) would equal $D = 1$, in contradiction to (i).

Proof of the assertions of Example 3.2. We start with assertion (ii). Fix an optimizing sequence $(\varphi_n,\psi_n)_{n=1}^\infty$ in the context of Example 2.3 such that

$$\|\tilde c-\varphi_n\oplus\psi_n\|_{L^1(\pi_0+\pi_1)}\le 1/n^3. \tag{28}$$

Pick a sequence $(a_k)_{k\in\mathbb{N}}$ of positive numbers such that

(a) $a_k\|h\|_{L^1(\pi_k)}\le C2^{-k}$ for all $k\in\mathbb{N}$,

(b) $a_k(\|\varphi_n\|_1+\|\psi_n\|_1)\le C2^{-k}$ for all $k\in\mathbb{N}$ with $n\le k$,

for some real constant $C$. After re-normalizing, if necessary, we may assume that $\sum_{k=1}^\infty a_k = 1$. Set $\pi := \sum_{k=1}^\infty a_k\pi_k$. From (a) we obtain $h\in L^1(\pi)\subset L^1(\pi)^{**}$; thus $h$ is viable for the problem (18) and hence $D^{(\pi)}\ge 1$. Clearly $P^{(\pi)}\le 1$, hence $P^{(\pi)} = D^{(\pi)} = 1$ and $h$ is a dual maximizer. Combining (28) with (27) we obtain

$$\|h-\varphi_n\oplus\psi_n\|_{L^1(\pi_k)}\le k/n^3.$$

Therefore

$$\|h-\varphi_n\oplus\psi_n\|_{L^1(\pi)}\le\sum_{k\le n} a_k\frac{k}{n^3}+\sum_{k>n} a_k\big(\|h\|_{L^1(\pi_k)}+\|\varphi_n\oplus\psi_n\|_{L^1(\pi_k)}\big)\le\frac1n+2C\sum_{k>n}2^{-k}.$$

Hence $\varphi_n\oplus\psi_n$ converges to $h$ in $\|\cdot\|_{L^1(\pi)}$. This shows assertion (ii) above.

To obtain (i) we construct a transport plan $\pi_\beta\in\Pi(\mu,\nu)$ such that $\int_{X\times Y} c\,d\pi_\beta = 0$. Note in passing that in view of (ii) we must have $\big\|\frac{d\pi_\beta}{d\pi}\big\|_{L^\infty(\pi)} = \infty$ for the $\pi$ constructed above. On the other hand, we must have $\frac{d\pi_\beta}{d\pi}\in L^1(\pi)$, if $a_k>0$ for all $k\in\mathbb{N}$, as every finite cost transport plan must be absolutely continuous with respect to $\pi$.

The idea is to concentrate $\pi_\beta$ on the set

$$\Gamma := \{(x,y) : c(x,y) = 0\} = \Big\{(x,x\oplus k\alpha) : k\ge 1,\ \sum_{i=0}^{k-1}\big(\mathbf{1}_{[0,\frac12)}(x\oplus i\alpha)-\mathbf{1}_{[\frac12,1)}(x\oplus i\alpha)\big)\le -1\Big\}.$$

To prove that this can be done it is sufficient to show that whenever $A\subseteq X$, $B\subseteq Y$, $\mu(A),\nu(B)>0$, a subset $A'$ of $A$ can be transported to a subset $B'$ of $B$ with $\nu(B') = \mu(A')>0$ via $\Gamma$. Then an exhaustion argument applies.

At this stage we encounter an interesting connection to the theory of measure preserving
systems. For $x \in X$ and $m \in \mathbb{Z}$ set

\[ S(x, m) := \big( x \oplus \alpha,\ m + 1_{[0,1/2)}(x) - 1_{[1/2,1)}(x) \big). \]

Then $S$ is a measure preserving transformation of the space $([0,1) \times \mathbb{Z}, \lambda \times \#)$. (See [Aar97]
for an introduction to infinite ergodic theory and the basic definitions in this field.) It is not
hard to see that the ergodic theorem, applied to the rotation by $\alpha$ on the torus, shows that
$S$ is non-wandering. Much less trivial is the fact that $S$ is also ergodic. This was shown by
K. Schmidt [Sch78] for a certain class of irrational numbers $\alpha \in [0,1)$, and in full generality
by M. Keane and J.-P. Conze [CK76]; see also [AK82].

The relevance of these facts to our situation is that for $k \ge 1$, the pair $(x, x \oplus k\alpha)$ is an
element of $\Gamma$ if and only if $S^k(x, 0) \in [0,1) \times \{-1, -2, \dots\}$. By ergodicity of $S$, there exists
$k$ such that

\[ (\lambda \times \#)\big( S^k[A \times \{0\}] \cap (B \times \{-1, -2, \dots\}) \big) > 0, \]

thus it is possible to shift a positive portion of $A$ to $B$ as required. By exhaustion, there
indeed exists a transport plan $\pi_\Gamma$ such that $\langle c, \pi_\Gamma \rangle = 0$. □
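
The equivalence between membership in $\Gamma$ and the position of $S^k(x,0)$ follows from a short induction on $k$. The sketch below uses the shorthand $\varrho := 1_{[0,1/2)} - 1_{[1/2,1)}$, which is introduced here only for illustration, and it assumes that the indicator functions in the definitions of $S$ and $\Gamma$ are those of $[0,1/2)$ and $[1/2,1)$.

```latex
% Shorthand (an assumption of this sketch): \varrho := 1_{[0,1/2)} - 1_{[1/2,1)},
% so that S(x, m) = (x \oplus \alpha, m + \varrho(x)).
% Claim: for every k \ge 1 (the case k = 1 is the definition of S),
\[
  S^k(x, 0) = \Big( x \oplus k\alpha,\ \sum_{i=0}^{k-1} \varrho(x \oplus i\alpha) \Big).
\]
% Induction step: apply S once more to the right-hand side.
\[
  S\Big( x \oplus k\alpha,\ \sum_{i=0}^{k-1} \varrho(x \oplus i\alpha) \Big)
  = \Big( x \oplus (k+1)\alpha,\ \sum_{i=0}^{k} \varrho(x \oplus i\alpha) \Big).
\]
% Hence the second coordinate of S^k(x,0) lies in \{-1,-2,\dots\}
% exactly when the partial sum is at most -1.
\[
  S^k(x, 0) \in [0,1) \times \{-1, -2, \dots\}
  \iff \sum_{i=0}^{k-1} \varrho(x \oplus i\alpha) \le -1 .
\]
```

Under the assumed reading, the condition on the right is the one that defines membership of $(x, x \oplus k\alpha)$ in $\Gamma$.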

The above example illustrates some of the subtleties of Theorem 3.1. However, it does not
yet provide evidence for the necessity of allowing for the singular part $h^s$ of the optimizer $h$
in Theorem 3.1. We have constructed a yet more refined, and rather lengthy, variant of the
Ambrosio-Pratelli example above, which shows that, in general, there is no way of avoiding
these complications in the statement of Theorem 3.1. We refer to the accompanying paper
[BLS09b, Section 3] for a presentation of this example, where it is shown that it can indeed
occur that the singular part $h^s$ in Theorem 3.1 does not vanish.



4. The Projective Limit Theorem 

We again consider the general setting where $c$ is a $[0,\infty]$-valued Borel measurable function.
To avoid trivialities we shall always assume that $\Pi(\mu,\nu,c)$ is non-empty.

Theorem 3.1 only pertains to the situation of a fixed element $\pi_0 \in \Pi(\mu,\nu,c)$: one then
optimizes the transport problem over all $\pi \in \Pi(\mu,\nu)$ with $\|d\pi/d\pi_0\|_{L^\infty(\pi_0)} < \infty$.

The purpose of this section is to find an optimizer $h$ which works simultaneously for
all $\pi_0 \in \Pi(\mu,\nu,c)$. We are not able to provide a result showing that a function $h$, plus
possibly some singular part $h^s$, exists which fulfills this duty for all $\pi \in \Pi(\mu,\nu,c)$. We
have to leave the question whether this is always possible as an open problem. But we can
show that a projective limit $H = (h_\pi)_{\pi \in \Pi(\mu,\nu,c)}$ exists which does the job.

We introduce an order relation on $\Pi(\mu,\nu,c)$: we say that $\pi_1 \preceq \pi_2$ if $\pi_1 \ll \pi_2$ and
$\|d\pi_1/d\pi_2\|_{L^\infty(\pi_2)} < \infty$. For $\pi_1 \preceq \pi_2$ there is a natural, continuous projection $P_{\pi_1,\pi_2} : L^1(\pi_2) \to
L^1(\pi_1)$ associating to each $h_{\pi_2} \in L^1(\pi_2)$, which is an equivalence class modulo $\pi_2$-null
functions, the equivalence class modulo $\pi_1$-null functions which contains the equivalence
class $h_{\pi_2}$ (and where this inclusion of equivalence classes may be strict, in general). We may
define the locally convex vector space $E$ as the projective limit

\[ E = \varprojlim_{\pi \in \Pi(\mu,\nu,c)} L^1(X \times Y, \pi). \]

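The elements of $E$ are families $H = (h_\pi)_{\pi \in \Pi(\mu,\nu,c)}$ such that, for $\pi_1 \preceq \pi_2$, we have

\[ P_{\pi_1,\pi_2}(h_{\pi_2}) = h_{\pi_1}. \]

A net $(H^\alpha)_{\alpha \in I}$ in $E$ converges to $H \in E$ if

\[ \lim_{\alpha \in I} \|h^\alpha_\pi - h_\pi\|_{L^1(\pi)} = 0, \quad \text{for each } \pi \in \Pi(\mu,\nu,c). \]

We may also define the projective limit

\[ E_S = \varprojlim_{\pi \in \Pi(\mu,\nu,c)} L^1_S(X \times Y, \pi), \]

which is a closed subspace of $E$.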
We start with an easy result. 

Proposition 4.1. Let $X$ and $Y$ be polish spaces equipped with Borel probability measures
$\mu, \nu$, and let $c : X \times Y \to [0,\infty]$ be Borel measurable. Assume that $\Pi(\mu,\nu,c)$ is non-empty.
There is $\pi_0 \in \Pi(\mu,\nu,c)$ such that

\[ P^{(\pi_0)} = \inf_{\pi \in \Pi(\mu,\nu,c)} P^{(\pi)}. \]

Proof. Let $(\pi_n)_{n=1}^\infty$ be a sequence in $\Pi(\mu,\nu,c)$ such that

\[ \lim_{n \to \infty} P^{(\pi_n)} = \inf_{\pi \in \Pi(\mu,\nu,c)} P^{(\pi)}. \]

It suffices to define $\pi_0$ as

oo 

n=l 

as we then have $\pi_n \preceq \pi_0$ for each $n \in \mathbb{N}$. □
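
To see why a convex combination of the $\pi_n$ does the job, the following sketch uses generic weights $a_n > 0$ with $\sum_n a_n = 1$. These weights are an illustrative assumption and need not be the ones used in the proof; they are also assumed to decay fast enough that $\sum_n a_n \langle c, \pi_n \rangle < \infty$, so that the combination has finite cost.

```latex
% Hypothetical weights: a_n > 0, \sum_n a_n = 1, \sum_n a_n <c, \pi_n> < \infty.
\[
  \pi_0 := \sum_{n=1}^{\infty} a_n \pi_n \in \Pi(\mu,\nu,c).
\]
% Each \pi_n is dominated by a multiple of \pi_0, so its density is bounded:
\[
  a_n \pi_n \le \pi_0
  \quad\Longrightarrow\quad
  \pi_n \ll \pi_0, \qquad
  \Big\| \frac{d\pi_n}{d\pi_0} \Big\|_{L^\infty(\pi_0)} \le \frac{1}{a_n} < \infty ,
\]
% that is, \pi_n \preceq \pi_0. A plan with bounded density with respect to \pi_n
% then also has bounded density with respect to \pi_0, so the infimum defining
% P^{(\pi_0)} runs over a larger set of plans:
\[
  P^{(\pi_0)} \le P^{(\pi_n)} \ \text{for all } n,
  \qquad\text{hence}\qquad
  P^{(\pi_0)} \le \lim_{n \to \infty} P^{(\pi_n)} = \inf_{\pi \in \Pi(\mu,\nu,c)} P^{(\pi)} .
\]
```

The reverse inequality holds trivially because $\pi_0$ itself belongs to $\Pi(\mu,\nu,c)$.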

Of course, if the primal problem (5) is attained, we have $P^{(\pi_0)} = P$.
The above proposition allows us to suppose w.l.o.g. in our considerations on the projective
limit $E$ that the $\pi$ appearing in the definition are all bigger than $\pi_0$:

\[ E = \varprojlim_{\pi \in \Pi(\mu,\nu,c)} L^1(\pi) = \varprojlim_{\pi \in \Pi(\mu,\nu,c),\, \pi \succeq \pi_0} L^1(\pi). \]

Clearly, we then have that the optimal transport cost $P^{(\pi)}$ is equal to $P^{(\pi_0)}$, for all $\pi \succeq \pi_0$.

Theorem 4.2. Let $X$ and $Y$ be polish spaces equipped with Borel probability measures $\mu, \nu$,
and let $c : X \times Y \to [0,\infty]$ be Borel measurable. Assume that $\Pi(\mu,\nu,c)$ is non-empty. Let
$\pi_0$ be as in Proposition 4.1.

There is an element $H = (h_\pi)_{\pi \in \Pi(\mu,\nu,c),\, \pi \succeq \pi_0} \in E$ such that, for each $\pi \in \Pi(\mu,\nu,c)$, $\pi \succeq
\pi_0$, the element $h_\pi \in L^1_S(\pi)^{**}$ satisfies $h_\pi \le c$ in the order of $L^1(\pi)^{**}$ and $h_\pi$ is an optimizer
of the dual problem (18)

(ft,, tt) = D$ := sup{(ft, 7r) : ft, £ L^(tt)**, ft < c}. 

We then have that, for each $\pi \in \Pi(\mu,\nu,c)$, $\pi \succeq \pi_0$, the decomposition $h_\pi = h_\pi^r + h_\pi^s$ of $h_\pi$
into its regular and singular parts verifies

- ft£ G -^s(tt) h^ < c in L 1 (7r); 

- $h_\pi^s \in L^1(\pi)^{**}$ and $h_\pi^s \le 0$ in the space of purely finitely additive measures which are
absolutely continuous with respect to $\pi$.

Moreover, for each $\pi \in \Pi(\mu,\nu,c)$, $\pi \succeq \pi_0$, there is no duality gap in the sense that

D { Z ] = DW = pW = p(to) (29) 

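where $D^{(\pi)} := \lim_{\varepsilon \to 0} \sup\big\{ \int \varphi \, d\mu + \int \psi \, d\nu : \varphi \in L^1(\mu),\ \psi \in L^1(\nu),\ \int (\varphi \oplus \psi - c)_+ \, d\pi \le \varepsilon \big\}$ and

$P^{(\pi)} := \inf\{ \langle c, \pi' \rangle : \pi' \in \Pi^{(\pi)}(\mu,\nu) \}$. If in addition the primal problem (2) is attained, for
instance if $c$ is lower semicontinuous, then all quantities in (29) are also equal to $P$.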

Proof. Fix $\pi \in \Pi(\mu,\nu,c)$, $\pi \succeq \pi_0$. We have seen in Theorem 3.1 that the set

K v = {ft G Ls(tt)** : ft < c, (ft,7r) = (c,tt)} 

is non-empty. In addition $K_\pi$ is closed and bounded in $L^1(\pi)^{**}$ and hence compact with
respect to the $\sigma(L^1(\pi)^{**}, L^1(\pi)^{*})$-topology.
For $\pi, \pi' \in \Pi(\mu,\nu,c)$ with $\pi \preceq \pi'$ the set

\[ K_{\pi,\pi'} = P_{\pi,\pi'}(K_{\pi'}) \]

is contained in $K_\pi$ and still a non-empty $\sigma^*$-compact convex subset of $L^1(\pi)^{**}$. By
compactness the following set is $\sigma^*$-compact and non-empty too:

\[ K_{\pi,\infty} = \bigcap_{\pi' \succeq \pi} K_{\pi,\pi'} . \]

We have $K_{\pi,\infty} = P_{\pi,\pi'}(K_{\pi',\infty})$ for $\pi \preceq \pi'$. Hence by Tychonoff's theorem the projective
limit

\[ \varprojlim_{\pi \in \Pi(\mu,\nu,c),\, \pi \succeq \pi_0} K_{\pi,\infty} \]

of the compact sets $(K_{\pi,\infty})_{\pi \succeq \pi_0}$ is non-empty, which is precisely the main assertion of the
present theorem.

Finally, (29) is a restatement of (20), and when the primal problem (2) is attained, the last
series of equalities follows from $P^{(\pi_0)} = P$. □

Clearly P ro1 < P < P^ ~> , hence with Theorem O and (J29l) one sees that 
D = P rcl < P < P^ ) = pM = D^J = flW 
for every $\pi \in \Pi(\mu,\nu,c)$ such that $\pi \succeq \pi_0$.